Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update query API to borrow Arrays instead of consume them #151

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

rroelke
Copy link
Collaborator

@rroelke rroelke commented Aug 14, 2024

This pull request changes the query API to borrow an Array rather than consume it. The entry points of ReadBuilder::new(array: Array) and WriteBuilder::new(array: Array) are changed to ReadBuilder::new(array: &Array) and WriteBuilder::new(array: &mut Array) respectively.

This change is intended to make it easier to submit multiple queries over an iteration. Moving the array into and out of each iteration is a bit of a pain.

The reason that the API originally consumed the Array is because it is unsafe to submit multiple queries against the same open array in parallel in the core library.
https://app.shortcut.com/tiledb-inc/story/35602/add-c-multithreaded-query-example
This story is about that (though it may not contain that detail directly, one of the linked stories might instead).

Since WriteBuilder::new now takes an exclusive reference, the array it borrows cannot be used as long as the query object is in scope - this satisfies the safety guarantee we want.

ReadBuilder::new takes a shared reference, which does allow construction of multiple ReadBuilder objects borrowing the same array. Multiple concurrent executions of tiledb_query_submit from the same thread are not possible since there is no async query support from core (there is an API but we do not currently wrap it and I think Paul suggested it is not stable for some reason?). Multiple concurrent executions of tiledb_query_submit from different threads are only possible if Array is Sync, so we add a test which ensures that Array is not Sync.

@rroelke rroelke changed the title Rr/api query borrow array Update query API to borrow Arrays instead of consume them Aug 14, 2024
@@ -0,0 +1,51 @@
error[E0277]: `Rc<context::RawContext>` cannot be shared between threads safely
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to check this in?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a little strange for me to assert on compiler output. Is the error format guaranteed to be stable going forward?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can simply assert it fails to compile.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to check this in?

Yep

Seems a little strange for me to assert on compiler output. Is the error format guaranteed to be stable going forward?

I don't imagine it would drift too much but I agree we should not assume it won't change. The trybuild crate makes it easy to recapture (run with TRYBUILD=1).

Maybe we can simply assert it fails to compile.

That is the goal but I think just "it fails to compile" is far too broad. Syntax error is also a failure to compile but doesn't tell me what I'm looking for and possibly hides a success. I'd be happy just checking the error number, but the other option to do that using doc tests did not work for some reason. I'm fine with the extra verbose output that we might have to recapture every once in a while as a tradeoff.

I'd definitely love to use some kind of regexp or something if trybuild had the means...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting from https://docs.rs/trybuild/latest/trybuild/

In all of these cases, the compiler’s output can change because our crate or one of our dependencies broke something, or as a consequence of changes in the Rust compiler

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest of that paragraph seems like an endorsement rather than a reason not to capture the compiler errors.

The beginning of the section seems more on point:

When it comes to compile-fail tests, write tests for anything for which you care to find out when there are changes in the user-facing compiler output. As a negative example, please don’t write compile-fail tests simply calling all of your public APIs with arguments of the wrong type; there would be no benefit.

This test is closer to "calling a public API with the wrong type" than "changes in user-facing compiler output".

I want to test that Array is not Sync. For some reason the doc test compile_fail,E0277 syntax did not work - it correctly checked that compilation failed but the error number was not used for some reason. I would like to have something stronger than "fails to compile" but not as sensitive "if the compiler changes the text of its error message then the test fails and need to evaluate recapturing", but I haven't found an option for that. And I think the over-sensitivity is the better choice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to test that Array is not Sync. For some reason the doc test compile_fail,E0277 syntax did not work -

As recently as a year ago, that required cargo nightly to work:

,E0515 indicates the error number you expect, which avoids any other errors that would make the test false negative. (This error number check will be ignored if cargo is not running as nightly)

via https://stackoverflow.com/questions/74989492/how-to-assert-that-code-wont-compile-in-a-test

@@ -44,10 +44,10 @@ impl Drop for RawQuery {
}
}

pub trait Query {
fn base(&self) -> &QueryBase;
pub trait Query<'a> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call 'a 'array instead? Is it meant to represent the lifetime of the array?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, yep. Happy to do that. Most of the standard library uses 'a but I agree that the lifetime names ought to be more verbose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with a comment which makes it clear instead of renaming everything

@svilen-mihaylov-db
Copy link
Contributor

Overall looks good. Left a few questions.

Copy link
Collaborator

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is intended to make it easier to submit multiple queries over an iteration. Moving the array into and out of each iteration is a bit of a pain.

Can you expand on this? I'm not entirely sure what it means and without understanding the motivation I'm having a hard time seeing wha this change accomplishes.

I can say that now that I've read even more Rust written by neither of us, it seems to me that lifetime annotations in public APIs are relatively rare in practice (as in, where we might have defaulted to lifetimes, it seems more like idiomatic Rust uses other primitives like Arc, Rc, and similar constructs). So adding a whole bunch of lifetime annotations in a public API is becoming a bit of a code smell to me.

That said, there are certainly places where lifetimes do show up, but those are seem to be in temporary instances like Iterators that are expected to be consumed directly rather than held onto for indeterminate lengths of time. I wouldn't consider an instance of Query to meet this sort of temporary status given how much internal state it keeps.

Multiple concurrent executions of tiledb_query_submit from different threads are only possible if Array is Sync, so we add a test which ensures that Array is not Sync.

I haven't spent the time trying to figure out whether or not you did something fancy or whether there's a difference between Sync and Send marker traits, but this assertion reminded me that we're already skirting the edges of safety with things like SendArray. Is there something that prevents the same trick from working for SyncArray?

https://github.com/TileDB-Inc/TileDB-Tables/blob/0f4fa1cd230502ea83896364db15d2e295e3be3a/crates/execution/src/execution/insert/mod.rs#L30-L31

@rroelke
Copy link
Collaborator Author

rroelke commented Aug 19, 2024

This change is intended to make it easier to submit multiple queries over an iteration. Moving the array into and out of each iteration is a bit of a pain.

Can you expand on this? I'm not entirely sure what it means and without understanding the motivation I'm having a hard time seeing wha this change accomplishes.

The difff obfuscates it a bit but this is probably the best place to look.

https://github.com/TileDB-Inc/tiledb-rs/pull/151/files#diff-7a7e3252a27e08927b12054d56095a3bd207adc73b9f06bd9320affcab0c1fc5L1365

We open an array. We iterate over a vec of write inputs and turn each into a Write query.

Previously, the construction of the Write query consumes the open array, so we have to get the array back from finalize (see 1474) in order to re-use it for the next iteration. With the change, this is unnecessary because dropping the Write query just un-borrows the array and we can just borrow it again in the next iteration.

This gets more complicated if we want to defer the construction of the Write query to a function inside the loop. The previous API would require passing the array in as a consumed argument and then returning it from the function and then assigning back to the array. This is a bit awkward but it manifests in a more ugly fashion when interacting with the proptest! closure form. The closure passed to proptest! must be immutable, which means that we can't assign to a variable which is captured from the outer scope. And this means that we can't open an array once and then run queries against it inside a proptest closure without doing the borrow. (I just remembered this which is why it is not in the description)

@rroelke
Copy link
Collaborator Author

rroelke commented Aug 19, 2024

I can say that now that I've read even more Rust written by neither of us, it seems to me that lifetime annotations in public APIs are relatively rare in practice (as in, where we might have defaulted to lifetimes, it seems more like idiomatic Rust uses other primitives like Arc, Rc, and similar constructs). So adding a whole bunch of lifetime annotations in a public API is becoming a bit of a code smell to me.

That said, there are certainly places where lifetimes do show up, but those are seem to be in temporary instances like Iterators that are expected to be consumed directly rather than held onto for indeterminate lengths of time. I wouldn't consider an instance of Query to meet this sort of temporary status given how much internal state it keeps.

Using Rc unfortunately wouldn't work well with Write queries which have &mut Array here.
https://doc.rust-lang.org/std/rc/struct.Rc.html#method.get_mut

If Write query construction required that the Rc<Array> moved into it were exclusive, then it could get the mutable reference. But then the client code wouldn't have their own Array handle, duplicating the problem that this PR wants to solve.

@rroelke
Copy link
Collaborator Author

rroelke commented Aug 19, 2024

Multiple concurrent executions of tiledb_query_submit from different threads are only possible if Array is Sync, so we add a test which ensures that Array is not Sync.

I haven't spent the time trying to figure out whether or not you did something fancy or whether there's a difference between Sync and Send marker traits, but this assertion reminded me that we're already skirting the edges of safety with things like SendArray. Is there something that prevents the same trick from working for SyncArray?

We (or someone else) can do unsafe impl Sync for SyncArrayOrWhatever, sure. But then it's up to us (or whoever else) to make sure that it is used properly, just as we are doing with unsafe impl Send for SendArray. I think doing this unsafe impl Sync carries a lot more risks with the underlying Context though, in addition to the possible concurrent executions of tiledb_query_submit.

@rroelke
Copy link
Collaborator Author

rroelke commented Sep 6, 2024

Another reason to change the API in a fashion like this -
If your query returns an error then the Array is dropped and you have to re-open it to do anything.
With this change, or something else like it (including the suggested Rc), then the array is not dropped if the query returns an error.

@davisp
Copy link
Collaborator

davisp commented Sep 6, 2024

Re-reading this discussion after a few weeks and more time spent in other Rust code bases, I'm going to double down on "lifetime annotations that spider through a code base seems like very non-idiomatic Rust" stance.

However, I'm pretty sure this is a text book case for using a RefCell and its try_borrow_mut method. The one thing I think you'd want that I don't have an immediate answer to is how to hold a mutable reference for the lifetime of the Query instance. Once we give a query that ref cell, nothing else should be able to borrow it (I think). Where a naive approach would only have the query hold a mutable ref during each method call, which would allow for folks attempting to interleave method calls on two different Query instances using a RefCell to the same Array which would be bad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants